Name | Version | Summary | Date |
pdf-searchable-ocr | 0.1.1 | A simple Python package for OCR with searchable PDF generation using PaddleOCR | 2025-10-09 12:39:52 |
quanta-pdf | 1.0.2 | Advanced PDF layout analysis engine for extracting figures, tables, and structured content | 2025-10-09 09:35:27 |
kiwi-pdf-chunker | 0.3.3 | A tool for parsing PDF document layouts and chunking content | 2025-10-08 10:31:42 |
privision | 1.0.1 | Video content redaction tool - OCR-based information recognition and masking system | 2025-10-07 17:32:48 |
doc2mark | 0.4.1 | Unified document processing with AI-powered OCR | 2025-10-07 05:09:42 |
kreuzberg | 3.15.0 | Document intelligence framework for Python - Extract text, metadata, and structured data from diverse file formats | 2025-09-14 18:14:57 |
sparrow-parse | 1.1.3 | Sparrow Parse is a Python package (part of Sparrow) for parsing and extracting information from documents. | 2025-09-14 14:00:15 |
doctra | 0.3.3 | Parse, extract, and analyze documents with ease | 2025-09-14 11:18:55 |
pdf2markdown | 0.3.0 | Python library and CLI tool that leverages LLMs to convert technical PDF documents to well-structured Markdown | 2025-09-14 02:02:58 |
docstrange | 1.1.6 | Extract and Convert PDF, Word, PowerPoint, Excel, images, URLs into multiple formats (Markdown, JSON, CSV, HTML) with intelligent content extraction and advanced OCR. | 2025-09-10 09:27:30 |
docling-onnx-models | 0.1.3 | ONNX Runtime implementations for Docling AI models | 2025-09-09 08:45:47 |
mseep-kreuzberg | 3.13.4 | Document intelligence framework for Python - Extract text, metadata, and structured data from diverse file formats | 2025-09-09 03:44:56 |
bot-vision-suite | 1.1.1 | Advanced Python library for GUI automation with multi-technique OCR, robust image detection, and an intelligent backtracking system | 2025-09-08 11:12:22 |
dedoc | 2.5 | Extract content and logical tree structure from textual documents | 2025-09-08 03:25:51 |
mcp-pdf | 1.0.1 | Secure FastMCP server for comprehensive PDF processing - text extraction, OCR, table extraction, forms, annotations, and more | 2025-09-07 07:00:52 |
mpxpy | 0.0.18 | Official Mathpix client for Python | 2025-09-05 17:43:50 |
sem-meta | 0.1.0 | Unified interface for SEM image processing: metadata extraction, OCR-based pixel size estimation, and unit conversion | 2025-09-05 15:12:45 |
marker-pdf | 1.9.2 | Convert documents to markdown with high speed and accuracy. | 2025-09-04 18:45:56 |
docu-devs-api-client | 1.0.8 | A client library for accessing DocuDevs API | 2025-09-04 15:25:43 |
docuglean-ocr | 1.0.0 | An SDK for intelligent document processing using SOTA VLLM models | 2025-09-02 13:19:12 |